\section{Minimize the Frobenius Norm of the Reconstruction Error}

In short, given a matrix \textbf{U}, we want to find a matrix \textbf{B} with orthonormal columns, so that $\textbf{B}^T\textbf{B} = \textbf{I}$ and $\textbf{BB}^T$ is an orthogonal projection, that minimizes the following reconstruction error

\begin{equation}\nonumber
\zeta = \|\textbf{U} - \textbf{BB}^T\textbf{U}\|_{F}^2
\end{equation}

This squared Frobenius norm can be expanded as follows, using the cyclic property of the trace and the constraint $\textbf{B}^T\textbf{B} = \textbf{I}$:

\begin{equation}\nonumber
\begin{split}
\zeta &= \|\textbf{U} - \textbf{BB}^T\textbf{U}\|_{F}^2 = tr[(\textbf{U} - \textbf{BB}^T\textbf{U})(\textbf{U} - \textbf{BB}^T\textbf{U})^T]\\
&= tr[\textbf{UU}^T - \textbf{UU}^T\textbf{BB}^T - \textbf{BB}^T\textbf{UU}^T + \textbf{BB}^T\textbf{UU}^T\textbf{BB}^T]\\
&= tr[\textbf{UU}^T] - tr[\textbf{UU}^T\textbf{BB}^T] - tr[\textbf{BB}^T\textbf{UU}^T] + tr[\textbf{BB}^T\textbf{UU}^T\textbf{BB}^T]\\
&= tr[\textbf{UU}^T] - tr[\textbf{UU}^T\textbf{BB}^T] - tr[\textbf{UU}^T\textbf{BB}^T] + tr[\textbf{BB}^T\textbf{BB}^T\textbf{UU}^T]\\
&= tr[\textbf{UU}^T] - 2tr[\textbf{UU}^T\textbf{BB}^T] + tr[\textbf{BI}\textbf{B}^T\textbf{UU}^T]\\
&= tr[\textbf{UU}^T] - 2tr[\textbf{UU}^T\textbf{BB}^T] + tr[\textbf{UU}^T\textbf{B}\textbf{B}^T]\\
&= tr[\textbf{UU}^T] - tr[\textbf{UU}^T\textbf{BB}^T]\\
&= tr[\textbf{UU}^T] - tr[\textbf{B}^T\textbf{UU}^T\textbf{B}]
\end{split}
\end{equation}
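The identity above is easy to check numerically. Below is a minimal NumPy sketch (the matrix sizes $n = 5$, $m = 2$ are arbitrary choices for illustration): we draw a random \textbf{U}, build a \textbf{B} with orthonormal columns via a QR factorization, and compare the squared Frobenius norm of the reconstruction error against the final trace expression.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2  # arbitrary dimensions for the sketch

U = rng.standard_normal((n, n))
# QR factorization yields orthonormal columns, so B^T B = I_m.
B, _ = np.linalg.qr(rng.standard_normal((n, m)))

# Left-hand side: squared Frobenius norm of the reconstruction error.
zeta = np.linalg.norm(U - B @ B.T @ U, 'fro') ** 2
# Right-hand side: the trace expression from the derivation.
trace_form = np.trace(U @ U.T) - np.trace(B.T @ U @ U.T @ B)

assert np.isclose(zeta, trace_form)
```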

Notice that $tr[\textbf{UU}^T]$ is the sum of all the eigenvalues of $\textbf{UU}^T$, which is nonnegative since $\textbf{UU}^T$ is positive semidefinite. $\textbf{B}^T\textbf{UU}^T\textbf{B}$ is also symmetric and positive semidefinite, so its trace is nonnegative as well. Moreover, $\zeta$ itself is nonnegative. Thus we have

\begin{equation}\nonumber
0 \leq tr[\textbf{B}^T\textbf{UU}^T\textbf{B}] \leq tr[\textbf{UU}^T]
\end{equation}

Thus, to minimize $\zeta$ we need to maximize $tr[\textbf{B}^T\textbf{UU}^T\textbf{B}]$, since $tr[\textbf{UU}^T]$ is fixed. The problem then becomes a constrained optimization problem. To see this more clearly, write $\textbf{B} = [\textbf{b}_1, \textbf{b}_2, \ldots , \textbf{b}_m]$. Then we have

\begin{equation}\nonumber
tr[\textbf{B}^T\textbf{UU}^T\textbf{B}] = \sum_{i=1}^{m}\textbf{b}_i^T\textbf{UU}^T\textbf{b}_i
\end{equation}
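This decomposition of the trace into a sum of quadratic forms over the columns of \textbf{B} can also be verified numerically. A small NumPy sketch (dimensions again chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3  # arbitrary dimensions for the sketch

U = rng.standard_normal((n, n))
B, _ = np.linalg.qr(rng.standard_normal((n, m)))

# Trace of B^T (U U^T) B ...
lhs = np.trace(B.T @ U @ U.T @ B)
# ... equals the sum of the per-column quadratic forms b_i^T (U U^T) b_i.
rhs = sum(B[:, i] @ U @ U.T @ B[:, i] for i in range(m))

assert np.isclose(lhs, rhs)
```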

We can now formulate the optimization problem as follows:

\begin{equation}\nonumber
\begin{split}
\textbf{maximize: }& f(\textbf{b}_1, \textbf{b}_2, \ldots, \textbf{b}_m) = \sum_{i=1}^{m}\textbf{b}_i^T\textbf{UU}^T\textbf{b}_i\\
\textbf{s.t. }& \textbf{b}_i^T\textbf{b}_j = \delta_{ij}, \quad i, j = 1, 2, \ldots, m
\end{split}
\end{equation}

Here $\delta_{ij}$ is the Kronecker delta. To solve this optimization problem, we use the method of Lagrange multipliers.

\begin{equation}\nonumber
\begin{split}
L(\textbf{b}_1, \textbf{b}_2, \ldots, \textbf{b}_m) &= \sum_{i=1}^{m}\textbf{b}_i^T\textbf{UU}^T\textbf{b}_i + \sum_{i=1}^{m}\lambda_i(1 - \textbf{b}_i^T\textbf{b}_i) - \sum_{i \neq j}\mu_{ij}\textbf{b}_i^T\textbf{b}_j\\
\frac{\partial L}{\partial \textbf{b}_i} &= 2\textbf{UU}^T\textbf{b}_i - 2\lambda_i\textbf{b}_i - \sum_{j \neq i}\mu_{ij}\textbf{b}_j = 0
\end{split}
\end{equation}

So far the problem is still hard to solve, especially once we take the nonlinear constraints into account. However, we can instead solve a less constrained problem by dropping the constraints $\textbf{b}_{i}^{T}\textbf{b}_{j} = 0$ for $i\neq j$. We then have

\begin{equation}\nonumber
\begin{split}
\frac{\partial L}{\partial \textbf{b}_i} = 2\textbf{UU}^T\textbf{b}_i &- 2\lambda_i\textbf{b}_i = 0\\
\textbf{UU}^T\textbf{b}_i &= \lambda_i\textbf{b}_i 
\end{split}
\end{equation}

This means the less constrained problem attains its optimum when the $\textbf{b}_i$ are eigenvectors of $\textbf{UU}^T$. Since the eigenvectors of a symmetric matrix can be chosen orthonormal, they also satisfy the orthogonality constraints and are therefore optimal solutions to our original problem. To maximize the objective, we simply pick the $m$ eigenvectors corresponding to the $m$ largest eigenvalues, and the maximum is then the sum of the $m$ largest eigenvalues.
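The conclusion can be checked numerically. In the NumPy sketch below (dimensions chosen arbitrarily), we take \textbf{B} to be the eigenvectors of $\textbf{UU}^T$ for the $m$ largest eigenvalues, verify that the objective equals the sum of those eigenvalues, and confirm that a random orthonormal \textbf{B} does no better:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2  # arbitrary dimensions for the sketch

U = rng.standard_normal((n, n))
S = U @ U.T  # symmetric positive semidefinite

# eigh returns eigenvalues in ascending order for a symmetric matrix.
w, V = np.linalg.eigh(S)
B = V[:, -m:]  # eigenvectors of the m largest eigenvalues

# The objective equals the sum of the m largest eigenvalues.
obj = np.trace(B.T @ S @ B)
assert np.isclose(obj, w[-m:].sum())

# A random orthonormal B cannot exceed this maximum.
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
assert np.trace(Q.T @ S @ Q) <= obj + 1e-9
```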